Speech restoration based on deep learning autoencoder with layer-wised pretraining
Authors
Abstract
A neural network can be used to “remember” speech patterns by encoding the statistical regularities of speech in its parameters. Clean speech can then be “recalled” when noisy speech is presented to the network. Adding more hidden layers increases network capacity, but as the number of hidden layers grows (a deep network), the network is easily trapped in a poor local solution when a traditional training strategy is used. As a result, a deep network sometimes performs even worse than a shallow one. In this study, we explore the greedy layer-wise pretraining strategy to train a deep autoencoder (DAE) for speech restoration, and apply the restored speech to noise-robust speech recognition. The DAE is first pretrained layer by layer with a quasi-Newton optimization algorithm, where each layer is treated as a shallow autoencoder and the output of the preceding layer serves as the input to the next layer. The pretrained layers are then stacked and “unrolled” into a DAE, and the pretrained parameters serve as the initial parameters of the DAE for fine-tuning. The trained DAE is used as a filter for speech restoration when noisy speech is given. Noise-robust speech recognition experiments were carried out to examine the performance of the trained deep network. Experimental results show that the DAE trained with the pretraining process significantly improved speech restoration from noisy input.
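The procedure described in the abstract (shallow autoencoders pretrained one at a time with a quasi-Newton optimizer, then stacked, unrolled, and fine-tuned) can be sketched as follows. This is only an illustrative assumption of how such a pipeline might look, not the authors' implementation: the layer sizes, data placeholders, and the use of PyTorch with torch.optim.LBFGS as the quasi-Newton optimizer are all hypothetical choices.

```python
import torch
import torch.nn as nn

# Hypothetical layer sizes: input is a vector of spectral features,
# hidden sizes are illustrative only.
layer_sizes = [257, 512, 256, 128]

def pretrain_layer(in_dim, out_dim, data, epochs=10):
    """Train one shallow autoencoder (encoder + decoder) on `data` using
    L-BFGS, a quasi-Newton method, and return its layers plus the hidden
    activations that feed the next layer."""
    enc = nn.Linear(in_dim, out_dim)
    dec = nn.Linear(out_dim, in_dim)
    model = nn.Sequential(enc, nn.Sigmoid(), dec)
    opt = torch.optim.LBFGS(model.parameters(), lr=0.1)
    loss_fn = nn.MSELoss()
    for _ in range(epochs):
        def closure():
            opt.zero_grad()
            loss = loss_fn(model(data), data)
            loss.backward()
            return loss
        opt.step(closure)
    with torch.no_grad():
        hidden = torch.sigmoid(enc(data))
    return enc, dec, hidden

# Greedy layer-wise pretraining: each layer is a shallow autoencoder whose
# hidden output serves as the input to the next layer.
x = torch.randn(1000, layer_sizes[0])    # placeholder for clean-speech features
encoders, decoders, h = [], [], x
for in_dim, out_dim in zip(layer_sizes[:-1], layer_sizes[1:]):
    enc, dec, h = pretrain_layer(in_dim, out_dim, h)
    encoders.append(enc)
    decoders.append(dec)

# Stack and "unroll" into a deep autoencoder: encoder layers followed by the
# decoder layers in reverse order; the pretrained weights are the initial
# parameters for fine-tuning.
layers = []
for enc in encoders:
    layers += [enc, nn.Sigmoid()]
for dec in reversed(decoders):
    layers += [dec, nn.Sigmoid()]
dae = nn.Sequential(*layers[:-1])        # linear output for reconstruction

# Fine-tune the whole network end to end on the same reconstruction objective.
opt = torch.optim.LBFGS(dae.parameters(), lr=0.1)
loss_fn = nn.MSELoss()
for _ in range(20):
    def closure():
        opt.zero_grad()
        loss = loss_fn(dae(x), x)
        loss.backward()
        return loss
    opt.step(closure)

# At test time, noisy features would be passed through the trained DAE to
# "recall" an estimate of the clean speech, which is then fed to the recognizer.
```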
Similar resources
Speech enhancement based on deep denoising autoencoder
We have previously applied a deep autoencoder (DAE) for noise reduction and speech enhancement. However, the DAE was trained using only clean speech. In this study, by using noisy-clean training pairs, we further introduce a denoising process in learning the DAE. In training the DAE, we still adopt the greedy layer-wise pretraining plus fine-tuning strategy. In pretraining, each layer is trained as a...
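A minimal sketch of the denoising variant described in this snippet follows; the only change relative to the previous example is that the reconstruction loss is computed between the network's output for a noisy input and the corresponding clean target. The small placeholder network and data below are assumptions so the snippet runs on its own.

```python
import torch
import torch.nn as nn

# Placeholder stand-in for the stacked, pretrained DAE from the earlier sketch.
dae = nn.Sequential(nn.Linear(257, 256), nn.Sigmoid(), nn.Linear(256, 257))

noisy = torch.randn(1000, 257)   # placeholder noisy-speech features
clean = torch.randn(1000, 257)   # time-aligned clean-speech targets

opt = torch.optim.LBFGS(dae.parameters(), lr=0.1)
loss_fn = nn.MSELoss()

def closure():
    opt.zero_grad()
    loss = loss_fn(dae(noisy), clean)   # input: noisy, target: clean
    loss.backward()
    return loss

opt.step(closure)
```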
Multi-pretrained Deep Neural Network
Pretraining is widely used in deep neural networks, and one of the most well-known pretraining models is the Deep Belief Network (DBN). The optimization formulas differ across pretraining models during the pretraining process. In this paper, we pretrained deep neural networks with different pretraining models and thereby investigated the difference between the DBN and the Stacked Denoising Autoencoder...
Deep Bottleneck Classifiers in Supervised Dimension Reduction
Deep autoencoder networks have successfully been applied in unsupervised dimension reduction. The autoencoder has a "bottleneck" middle layer of only a few hidden units, which gives a low dimensional representation for the data when the full network is trained to minimize reconstruction error. We propose using a deep bottlenecked neural network in supervised dimension reduction. Instead of tryi...
Advances in Deep Learning
Deep neural networks have recently become increasingly popular under the name of deep learning due to their success in challenging machine learning tasks. Although the popularity is mainly due to these recent successes, the history of neural networks goes as far back as 1958, when Rosenblatt presented a perceptron learning algorithm. Since then, various kinds of artificial neural networks hav...
Convergence rates for pretraining and dropout: Guiding learning parameters using network structure
Unsupervised pretraining and dropout have been well studied, especially with respect to regularization and output consistency. However, our understanding of the explicit convergence rates of the parameter estimates, and of their dependence on the learning (like denoising and dropout rate) and structural (like depth and layer lengths) aspects of the network, is less mature. An interesting questio...
Journal title:
Volume / Issue:
Pages: -
Publication date: 2012